Sex seems to be the determining factor in the hierarchical clustering. Neither disease status nor ethnicity seems to cluster in any meaningful way. Also, one sample seems to have a mismatched ‘sex’ label.
Heatmap with the same clustering. Highly distant groups in rows are separated by sex.
It would take 119 principal components to capture 90% of the variance in the data.
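A count like this is usually read off the cumulative explained variance. The sketch below shows that calculation under stated assumptions: the function name `components_for_variance` and the toy variances are hypothetical and are not taken from this analysis.

```python
from itertools import accumulate


def components_for_variance(variances, threshold=0.90):
    """Return the smallest number of leading PCs whose cumulative
    share of the total variance reaches `threshold`."""
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    # PCs are conventionally ordered by decreasing variance.
    ordered = sorted(variances, reverse=True)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Toy per-component variances (illustrative only, not this data set).
toy_variances = [5.0, 3.0, 1.5, 0.5]
print(components_for_variance(toy_variances))  # 3
```

With the real data, the input would be the per-component variances from the PCA (for example, the squared standard deviations of the components).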
Sex is shown by color, disease status by shape. The sample with the mismatched ‘sex’ label is visible here too.
Heatmap of the first 7 PCs.
It seems that the only relevant characteristic separated by the first 5 PCs is sex.
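One way to put a number on such a visual impression is a standardized mean difference between the two groups on each PC. This is only a sketch of that idea, not the method used for the plots here; the helper `group_separation` and the toy scores are hypothetical.

```python
from statistics import mean, stdev


def group_separation(scores, labels):
    """Absolute difference between the two group means on one PC,
    divided by the pooled standard deviation."""
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")
    a = [s for s, g in zip(scores, labels) if g == groups[0]]
    b = [s for s, g in zip(scores, labels) if g == groups[1]]
    pooled_var = (
        (len(a) - 1) * stdev(a) ** 2 + (len(b) - 1) * stdev(b) ** 2
    ) / (len(a) + len(b) - 2)
    if pooled_var == 0:
        raise ValueError("no within-group variation")
    return abs(mean(a) - mean(b)) / pooled_var ** 0.5


# Toy PC scores for six samples (illustrative only, not this data set).
sex = ["F", "F", "F", "M", "M", "M"]
pc1 = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]  # groups far apart
pc2 = [0.1, -0.2, 0.1, 0.0, 0.2, -0.2]   # groups overlap

print(group_separation(pc1, sex) > group_separation(pc2, sex))  # True
```

A large value on a PC suggests the grouping variable is separated along that component; repeating the calculation with disease status as the label would give a comparable figure for that variable.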
The lighter the point, the higher the age.